updated: 2022-01-23_12:32:31-05:00
Non-Context Free Languages
Pumping Lemma
Introduction and Givens
let G be a CFG for a CFL L
let b be the maximum number of terminals on the RHS of a rule
let w be the string derived by G
let v be the number of variables
w = w1, w2, w3...
S => ... => ... => ... => w
^ length (number of derivations) is k steps long
At every step in derivation, we can get at most b new terminals
Example
S > aTb
T > aU
U > bV
V > baV | e
S => aTb => aaUb => aabVb => aabbaVb => aabbab
Number of derivations is 5. 5 is more than 4 (variables)
$\therefore$ one variable must loop
length of w for derivation of length k:
$|w|\leq kb$
$kb\geq |w|$
$k \geq \frac{|w|}{b}$
A derivation of length k contains
- How many variables if repetition is allowed?
- How many variables if repetition is not allowed?
If we pick k > |v| then at least two intermediate steps will contain a variable
i.e. A variable will be repeated in derivation
To ensure this repetition we choose w such that $k \geq |v|$
- $\frac{|w|}{b}\geq |v|$
- $|w|\geq b\star |v|$
- Choosing this guarantees a repetition of a variable in the derivation
$S \stackrel{}{\Rightarrow} uAz \stackrel{}{\Rightarrow} vAy\stackrel{\star}{\Rightarrow} w$
A is the variable that is repeating
$|w|\geq b\star |v|$
S{
u
A{
v
A{
x
}
y
}
z...
}
If you have a grammar like this, we can get no repetitions, one repetitions, or many repetitions
What does this mean?
$uv^{k}xy^{k}z\in L$ for k $\geq$ 0
if k = 0 -> uxz
if k = 1 -> uvxyz
if k = 2 -> uv$^2$xy$^2$z
We want to come up with a string that has has repeats
Theorem
- If L is a CFL, then there is a number P > 0 called pumping length such that if w$\in$L and |w|$\geq$p, then w can be written as uvxyz where:
- |vxy| $\leq$ P
- vy $\neq$ e
- uv$^{K}$xy$^{K}$z $\in$ L for all k $\geq$ 0
Proof:
Let L be a CFL and let G be the CFG that generates L
Let b be the maximum number of symbols on RHS of any rule
V is the number of variables
Let p = max(b$^{|V|}$+1,b$^{|V|+1}$)
Parse tree:
Number of steps is the same as the height of the parse tree
max length of string is b$^h$
consider the smallest possible parse tree for w
Subtree w:
Parse tree
because |w|$\geq$ b$^{|v|}$ the height of the tree must be greater than |v|
Let $\pi$ be one of the longest paths in the tree, length of $\pi$ > |v|
this means $\pi$ contains repetition among the bottom |v| + 1 variables
then w = uvxyz is generated by repeating the portion of the tree that denotes "VAY"
parse tree can be constructed for every string of form uv$^{K}$xy$^{K}$z $\in$ L for all k $\geq$ 0
height >= |V| implies repetition
Example: a^n b^n c^n
L = {$a^nb^nc^n|n\geq0$}
Suppose L is CFL
let p be the pumping length provided by PL
consider $w\in L$ and $w = a^pb^pc^p$
clearly $|w|\geq p$
According to PL, w can be represented as uvxyz where:
1. |vxy| $\leq$ P
2. vy $\neq$ e
3. uv$^{K}$xy$^{K}$z $\in$ L for all k $\geq$ 0
vxy could be anywhere in the string. Can't say it's all a's or b's because of that. There are multiple cases
Two cases:
- v or y contain more than one kind of symbol (2 symbols, either a & b or b & c)
- let k = 2
- $uv^2xy^2z$
- v or y have 2 characters, so repeating them means they are out of order when power 2 applies
- so if v x y was aa a ab => aaaa a abab
- now there is an a after a b...
- also, the number of b's and c's is different
- v & y each contain one type of symbol
- $uv^2xy^2z$
- Options:
- v is all a's; y is all b's
- $a^{p+|v|}b^{p+|y|}c^{p}\notin L$
- v is all b's; y is all c's
- $a^pb^{p+|v|}c^{p+|y|}\notin L$
One of these cases must occur, $\therefore$ a contradiction must occur
- $a^pb^{p+|v|}c^{p+|y|}\notin L$
- v is all a's; y is all b's
Example: ww
L = {ww} (not a palindrome)
bla bla proof stuff
- |vxy| $\leq$ P
- vy $\neq$ e
- uv$^{K}$xy$^{K}$z $\in$ L for all k $\geq$ 0
let w = $0^p1^p0^p1^p$
w consists of 4 blocks of length p
| vxy | $\leq$ p => vxy can can touch at most two blocks
Case 1
suppose that v and y are both contained within a single block (same symbol)
say block 1. ie: vxy is all 0's
let k = 2
$uv^2xy^2z$ => $0^{p+i}1^p0^p1^p$ for i > 0
$uv^2xy^{2}z\notin L$
if v & y are all 1's:
$0^p1^{p+i}0^p1^p$ is also $\notin$ L
Now, for HW, look at multiple cases
Case 2
suppose v & y each have two different symbols
0's and 1's mixed up
$uv^2xy^2z\notin L$ because the 0's and 1's are not in order
therefore not in the language, and since neither is in language, not a CFL
...
Example: L = {a^i b^j c^k | 0 <= i <= j <= k}
Let L be a CFL and let p be the pumping length given by a pumping length
w = a$^p$ b$^p$ c$^p$
- we are choosing this because we know a^n b^n c^n not cfl, but a little different since we will have to pump up due to the less than
|w| $\geq$ p
therefore according to pumping lemma w can be represented as uvxyz where:
- |vxy| $\leq$ P
- vy $\neq$ e
- uv$^{K}$xy$^{K}$z $\in$ L for all k $\geq$ 0
Case 1
Both v and y contain the same symbol
three subcases.. can't be generalized
- v & y are in the middle (all b's)
- b^p => vxy
- pumping down ... $a^{p}b^{p-|m|}c^{p}$
- |a| > |b|
- soooo
- $uv^{0}xy^{0}z\notin L$ because |a| > |b|
- v and y are in the 1st part of the L (all a's)
- pumping down won't work, so we need to pump up
- Pumping up
- k = 2
- $a^{p+|m|}b^{p}c^{p}\notin l$
- v and y are in the last part of L (all C's)
- pump down
- $a^{p}b^{p}c^{p=|m|}\notin L$
- pump down
Case 2
both v and y contain different symbols
can be generalized:
- v is all a's and some b's ; y is all b's
- v is all b's and some c's; y is all c's
OR - v is some a's
- b is some a's and some b's
you can say "either v or y contain 2 different symbols" (thanks Nick!)
either way, the order is going to be wrong
- like if v = aaabbb, v^2 is aaabbbbaaabbb
- now a's are after b's
Pump up: $uv^{2}xy^{2}z \notin L$
blabla, therefore not in set of CFL etc
Homework by Friday:
(from book)
2.30:
a) {0$^n$ 1$^n$ 0$^n$ 1$^n$ | n $\ge$ 0} is not a CFL
d) {t$_1$ # t$_2$ # ... # t$_k$ | k $\ge$ 2, each t$_i$ $\in$ {a,b}$^{\star}$ and t$_i$ = t$_j$ for some i $\ne$ j}
2.33:
show that f = {a$^{i}$ b$^{j}$ | i = kj for some positive integer k} is not a CFL
Example: L = {1^i # 1^j # 1^{i+j} | i <= j}
" CFL Pumping lemma stuff
choose w = 1$^{p}$ # 1$^{p}$ # 1$^{2p}$
uvxyz
Case 1
either v or y contains #
- uv$^{2}$xy$^{2}$z => 1$^{p}$##1$^{p}$#1$^{2p}\notin L$
- uv$^{0}$xy$^{0}$z => 1$^{p}$1$^{p}$ # 1$^{2p}\notin L$
^^ one is pumping up, one is down, we don't need to do both, just show one
She is reexplaining this case...
uv$^{2}$xy$^{2}$z => #1#1#1^p ... too many hashes
probably ignore this to not get confused
Case 2
neither v or y contain #
v & y could be first, second, or third block of the string (like between the #)
they will fall within a block
- first block (v & y are in first block)
- length of $|vxy| \le p$, $\therefore$ v and y are in 1st block
- soo, we should pump up
- uv$^{2}$xy$^{2}$z$\notin$ L because i > j
- second block (v & y are in second block)
- pump down
- uv$^{0}$xy$^{0}$z $\notin$ L bc i>j
- third block (v & y are in third block)
- pump up
- uv$^{2}$xy$^{2}$z => 1$^{p}$#1$^{p}$#$^{2p+|m|}\notin L$
- p+p $\neq 2p + |m|$
to be continued